Real-time quality analyzer for voice and audio signals

ABSTRACT

A method for providing real-time perceptual quality measurements of an audio signal (12) in which a quality test signal, including an audio test signal, is received by equipment under test. Playback of a pre-stored representation of the audio signal is coarsely synchronized (20) to the received audio test signal, for example, utilizing a synchronizing pulse in a header of the quality test signal. The playback is then finely synchronized (24) to the received audio signal, for example, by comparing data in a windowed portion of the received audio test signal and a windowed portion of the pre-stored representation of the audio test signal and by adjusting a windowed portion of the pre-stored representation of the audio test signal in accordance with results of the comparison. A window of the received audio test signal is then compared (14) to a portion of the finely synchronized playback of the pre-stored representation of the audio test signal to output a quality measurement of the received audio test signal.

BACKGROUND OF THE INVENTION

This invention relates to methods and apparatus for providing quality measurements for voice equipment under test, and more particularly to methods and apparatus for providing real-time objective perceptual quality measurements of voice or audio signals received by such equipment.

Voice quality evaluation is a difficult task for speech systems, especially those involving compression and coding, because common waveform and spectrum similarity criteria do not correlate particularly well with perceived quality of received voice signals. Formerly, voice quality of telecommunication systems has been measured off-line using formal perceptual listening tests that are performed in a carefully controlled environment, using pre-prepared voice material. Although this practice is effective, it is both costly and time consuming. In addition, results obtained from such tests are dependent upon the individual test subjects and their environment. As a result, findings from such tests are not always repeatable or consistent.

Recent research in the field of psycho-acoustics has led to a better understanding of how human beings perceive voice and sounds. By applying some of the findings of this field, such as critical band theory, auditory masking, and perceptual loudness, it is now possible to develop “objective” speech measures that closely match results of formal subjective listening tests. Various organizations, including, for example, the International Telecommunication Union (ITU), have developed algorithms to measure voice quality off-line using files stored in a computer. Examples of known objective measurement algorithms are Perceptual Speech Quality Measure (PSQM), Measuring Normalizing Blocks (MNB), Perceptual Analysis Measurement System (PAMS), and Modified Bark Spectral Distortion (MBSD) measure. The latter measurement, for example, splits frequencies into bands that reflect human auditory reception.

Known objective perceptual quality measurement systems require measurement of voice quality to be done off-line, i.e., from stored, received voice data. It would be desirable if such objective perceptual quality measurements could be made in real time or near real time in operational equipment.

BRIEF DESCRIPTION OF THE INVENTION

The present invention, in one aspect, is a method for providing real-time perceptual quality measurements of an audio signal. A quality test signal, including an audio test signal, is received by equipment under test. Playback of a pre-stored representation of the audio signal is coarsely synchronized to the received audio test signal, for example, utilizing a synchronizing pulse in a header of the quality test signal. The playback is then finely synchronized to the received audio signal, for example, by comparing data in a windowed portion of the received audio test signal and a windowed portion of the pre-stored representation of the audio test signal and by adjusting a windowed portion of the pre-stored representation of the audio test signal in accordance with results of the comparison. A window of the received audio test signal is then compared to a portion of the finely synchronized playback of the pre-stored representation of the audio test signal to output a quality measurement of the received audio test signal.

In another aspect, the invention comprises an audio quality analyzer (AQA) to evaluate quality of a quality test signal received by equipment under test, where the quality test signal includes an audio test signal. The AQA is configured to coarsely synchronize playback of a pre-stored representation of the audio test signal to the received audio test signal, to finely synchronize playback of said pre-stored representation of the audio test signal to the received audio test signal and to compare a window of the received audio test signal to a portion of the finely synchronized playback of the pre-stored representation of the audio test signal to output a quality measurement of the received audio test signal.

Accordingly, it will be seen that the invention provides objective perceptual quality measurements of audio and voice signals in real time or near real time in operational equipment.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of an embodiment of a voice quality analyzer in accordance with the invention.

FIG. 2 is a diagram of a quality test message frame.

FIG. 3 is a diagram of another embodiment of a voice quality analyzer in accordance with the invention.

FIG. 4 is a block diagram of an embodiment of a buffer providing sync windowing and selection windowing in accordance with the invention.

FIG. 5 is a drawing representing a rectangular window function shape.

FIG. 6 is a drawing representing a nonlinear emphasized window function shape.

FIG. 7 is a drawing representing a discontinuous rectangular window function.

FIG. 8 is a block diagram of a test configuration in accordance with the invention.

FIG. 9 is a flow chart of an embodiment of a test method in accordance with the invention.

DETAILED DESCRIPTION OF THE INVENTION

FIG. 1 is a block diagram of a voice quality analyzer (VQA) 10 that receives a voice signal output by voice equipment under test (VEUT) 12. VQA 10 comprises a quality evaluator 14 that generates a quality measurement of voice test signals received from VEUT 12. VQA 10 also comprises a header detector 16 which, in turn, comprises a dual tone multiple frequency (DTMF) detector 18 and a sequencer 20. DTMF detector 18 monitors signals received from VEUT 12 to detect and decode signaling tones present in the received signals. The decoded signals are used by sequencer 20 to control operation of a voice sentence generator 22.

Pre-stored representations of voice test signals are stored in voice sentence generator 22. Such “sentences” may, but do not necessarily, represent full sentences or words in any particular language, nor do they necessarily represent speech from any particular human. Rather, the representations are selected for facilitating the voice quality measurement performed by quality evaluator 14. When a header signal preceding a voice test signal is received, sequencer 20 initiates playback of a particular pre-stored voice test signal representation from voice sentence generator 22, depending upon a particular voice test signal that is identified in the header. To achieve synchronization between the pre-stored representation of a voice test signal and the received voice test signal sufficient to perform an objective perceptual quality comparison utilizing quality evaluator 14, a fine synchronizer 24 is provided. Voice quality measurement is performed by applying an objective perceptual quality measurement algorithm to compare a portion of the synchronized, locally generated reference signal from fine synchronizer 24 to a windowed portion of the signal received from VEUT 12. In one embodiment, one of the following algorithms is used: Perceptual Speech Quality Measure (PSQM), Measuring Normalizing Blocks (MNB), Perceptual Analysis Measurement System (PAMS), and Modified Bark Spectral Distortion (MBSD) measure. In another embodiment, a plurality of different algorithms is available, and an algorithm selection is made manually. In another embodiment (not shown), a plurality of different algorithms is available, and a selection is made dependent upon which pre-stored representation in voice sentence generator 22 is selected by sequencer 20.
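The selection of a measurement algorithm according to the pre-stored representation chosen by the sequencer can be pictured as a simple lookup. The following Python sketch is illustrative only: the table `ALGORITHM_FOR_SENTENCE`, the function `evaluate`, and the two error measures are hypothetical names, and the measures are plain waveform stand-ins, not implementations of PSQM, MNB, PAMS, or MBSD.

```python
from typing import Callable, Dict, Sequence

# A measure takes (reference, received) sample windows and returns a score.
Measure = Callable[[Sequence[float], Sequence[float]], float]


def mean_squared_error(reference: Sequence[float], received: Sequence[float]) -> float:
    """Stand-in measure (assumption); NOT a perceptual algorithm."""
    return sum((a - b) ** 2 for a, b in zip(reference, received)) / len(reference)


def mean_absolute_error(reference: Sequence[float], received: Sequence[float]) -> float:
    """Second stand-in measure (assumption); NOT a perceptual algorithm."""
    return sum(abs(a - b) for a, b in zip(reference, received)) / len(reference)


# Hypothetical mapping from sentence ID to the measure used for that sentence.
ALGORITHM_FOR_SENTENCE: Dict[int, Measure] = {
    1: mean_squared_error,
    2: mean_absolute_error,
}


def evaluate(sentence_id: int, reference: Sequence[float], received: Sequence[float]) -> float:
    """Compare one reference window with one received window using the
    measure associated with the identified sentence."""
    measure = ALGORITHM_FOR_SENTENCE[sentence_id]
    return measure(reference, received)
```

In a real analyzer each table entry would presumably point to one of the perceptual algorithms named above; the lookup structure stays the same.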

In one embodiment and referring to FIG. 2, an example of a quality test message 30 is shown. Quality test message 30 includes four sections 32, 34, 36, 38, of which three, 32, 34, and 36, comprise a header 40 that is transmitted utilizing DTMF signaling, and the fourth is a voice test message 38. Unique word 32 is used to signal the start of a new quality test message 30. Unique word 32 is included to prevent false measurement start signals during periods of severe channel degradation, for example, periods of very noisy reception by VEUT 12 of signals from a cellular network. Sentence ID 34 includes an index number or identifier of voice test message 38, thereby permitting different test messages to be transmitted to VEUT 12 and identified by VQA 10. Sync pulse 36 is a short DTMF pulse that is used to signal the start of voice test signal 38. Sync pulse 36 is used by sequencer 20 to start voice sentence generator 22 playing the appropriate pre-stored voice test signal representation for comparison with that received by VEUT 12. In other embodiments, header 40 is transmitted in another manner, for example, using another form of in-band signaling, or by using out-of-band signaling. In these other embodiments, means other than DTMF detector 18 are used to detect and respond to header 40. Examples of suitable in-band signaling include monotone signaling and telephony data protocol. An example of suitable out-of-band signaling is signaling on a separate paging channel.
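The frame layout of FIG. 2 can be modeled as a small data structure. The sketch below is a Python illustration under stated assumptions: the text does not specify the DTMF symbols of the unique word, the width of the sentence ID, or the symbol used for the sync pulse, so `UNIQUE_WORD`, `ID_DIGITS`, and `SYNC_PULSE` are hypothetical values chosen for the example.

```python
from dataclasses import dataclass
from typing import List

UNIQUE_WORD = "A1B2"  # hypothetical DTMF symbol sequence for unique word 32
ID_DIGITS = 2         # hypothetical width of sentence ID 34, in DTMF digits
SYNC_PULSE = "#"      # hypothetical DTMF symbol for sync pulse 36


@dataclass
class QualityTestMessage:
    """One quality test message: a DTMF header followed by voice samples."""
    sentence_id: int
    voice_samples: List[float]

    def header_symbols(self) -> str:
        """Return the header as a string of DTMF symbols, in transmit order:
        unique word, zero-padded sentence ID, then the sync pulse."""
        return f"{UNIQUE_WORD}{self.sentence_id:0{ID_DIGITS}d}{SYNC_PULSE}"
```

For example, `QualityTestMessage(7, samples).header_symbols()` yields `"A1B207#"` under these assumed values; the voice samples would follow the sync pulse.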

In one embodiment and referring to FIG. 3, sequencer 20 includes a unique word detector 42, a sentence ID detector 44, and a coarse sync detector 46, which include the functions of DTMF detector 18 of FIG. 1. Therefore, no separate DTMF detector 18 is shown in FIG. 3. When a unique word 32 is recognized by unique word detector 42, subsequently received data is passed to sentence ID detector 44. Sentence ID detector 44 detects sentence ID 34 that is received after the unique word. When sentence ID 34 is identified, it is passed to voice sentence generator 22 so that it can output the proper pre-stored representation of a voice test signal corresponding to a voice test signal identified by sentence ID 34, and subsequently received data is passed to coarse sync detector 46. Coarse sync detector 46 detects sync pulse 36 which, in one embodiment, is coded as a short DTMF pulse. When a coarse sync signal from coarse sync detector 46 is received, voice sentence generator 22 begins playback of a pre-stored representation of a voice signal corresponding to the determined sentence ID 34.
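The hand-off from unique word detection to sentence ID detection to coarse sync detection behaves like a small state machine. The following Python sketch is one possible reading, not the implementation of sequencer 20: the class `Sequencer`, its state names, and the default symbol values are hypothetical, and the input is assumed to be a stream of already decoded DTMF symbols.

```python
from enum import Enum, auto


class State(Enum):
    WAIT_UNIQUE_WORD = auto()  # unique word detector active
    READ_SENTENCE_ID = auto()  # sentence ID detector active
    WAIT_SYNC = auto()         # coarse sync detector active
    PLAYING = auto()           # reference playback has been started


class Sequencer:
    """Consumes decoded DTMF symbols one at a time (hypothetical model)."""

    def __init__(self, unique_word: str = "A1B2", id_digits: int = 2,
                 sync_symbol: str = "#") -> None:
        self.unique_word = unique_word
        self.id_digits = id_digits
        self.sync_symbol = sync_symbol
        self.state = State.WAIT_UNIQUE_WORD
        self.recent = ""        # tail of recent symbols for unique word matching
        self.id_buffer = ""
        self.sentence_id = None

    def feed(self, symbol: str) -> State:
        if self.state is State.WAIT_UNIQUE_WORD:
            # Keep only as many symbols as the unique word is long.
            self.recent = (self.recent + symbol)[-len(self.unique_word):]
            if self.recent == self.unique_word:
                self.state = State.READ_SENTENCE_ID
        elif self.state is State.READ_SENTENCE_ID:
            if not symbol.isdigit():
                # Corrupted ID: start over and wait for a new unique word.
                self.recent, self.id_buffer = "", ""
                self.state = State.WAIT_UNIQUE_WORD
            else:
                self.id_buffer += symbol
                if len(self.id_buffer) == self.id_digits:
                    self.sentence_id = int(self.id_buffer)
                    self.state = State.WAIT_SYNC
        elif self.state is State.WAIT_SYNC:
            if symbol == self.sync_symbol:
                self.state = State.PLAYING  # coarse sync achieved
        return self.state
```

Feeding the symbols of `"9A1B207#"` one by one leaves the sequencer in `State.PLAYING` with `sentence_id` equal to 7; the leading stray `9` is ignored because the unique word has not yet been seen.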

In one embodiment, the coarse synchronization provided by sync pulse 36 is not sufficient to enable signal comparator 14 to compare a voice test signal 38 to a pre-stored representation of a voice signal in real time, i.e., so that the quality evaluations performed by signal comparator 14 occur during receipt of voice test signals 38 with little or no apparent delay as perceived by a user. In one embodiment, coarse synchronization is not sufficient for analyzing voice test signals 38 using Perceptual Speech Quality Measure (PSQM), Measuring Normalizing Blocks (MNB), Perceptual Analysis Measurement System (PAMS), and Modified Bark Spectral Distortion (MBSD) measure algorithms. Therefore, a fine sync detector 24 is provided for more accurate synchronization. Fine sync detector 24 compares the output of voice sentence generator 22 with a window of voice data selected by sync windowing module 52. This comparison is performed, in one embodiment, in accordance with International Telecommunication Union (ITU) standard P.931, “Multimedia Communications Delay, Synchronization and Frame Rate Measurement.” As a result of this comparison, outputs of fine sync detector 24 are produced to control a switch 54, which is closed when fine synchronization is achieved. Switch 54 prevents quality evaluations from being output before fine synchronization is achieved. In addition, data windows representing synchronized portions of a pre-stored representation of a voice test signal are output to a selection windowing module 56. Selection windowing module 56 selects a synchronized portion of the incoming voice test data 58 to compare to the synchronized portions of the pre-stored representation 60. The comparison is performed by perceptual comparator 14, and a quality evaluation is produced. The quality evaluation is output when switch 54 is closed, as indicated above.
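The idea of comparing a window of received data with the reference playback and adjusting alignment from the result can be illustrated with a lag search. The Python sketch below does not implement ITU P.931; it substitutes a plain normalized cross-correlation as a stand-in, and the function names, the search range `max_lag`, and the lock `threshold` are assumptions made for the example. The returned `locked` flag plays the role of switch 54: quality output would be suppressed while it is false.

```python
import math
from typing import Sequence, Tuple


def normalized_correlation(a: Sequence[float], b: Sequence[float]) -> float:
    """Correlation coefficient of two equal-length windows (no mean removal)."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den > 0.0 else 0.0


def fine_sync(received: Sequence[float], reference: Sequence[float],
              window_len: int, max_lag: int,
              threshold: float = 0.9) -> Tuple[int, bool]:
    """Search lags in [-max_lag, max_lag] for the reference window that best
    matches one received window. Returns (best_lag, locked)."""
    start = max_lag  # leave room so that negative lags stay inside the data
    if (len(received) < start + window_len
            or len(reference) < start + max_lag + window_len):
        raise ValueError("not enough samples for the requested search range")
    rx_window = received[start:start + window_len]
    best_lag, best_score = 0, -1.0
    for lag in range(-max_lag, max_lag + 1):
        ref_window = reference[start + lag:start + lag + window_len]
        score = normalized_correlation(rx_window, ref_window)
        if score > best_score:
            best_lag, best_score = lag, score
    # Gate: report lock only when the best match is strong enough.
    return best_lag, best_score >= threshold
```

A positive `best_lag` means the received signal runs ahead of the coarsely synchronized playback by that many samples, so the reference window would be advanced accordingly before the quality comparison.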

FIG. 4 is a drawing of a representation of the windowing operation of sync window module 52 and selection windowing module 56 in one embodiment of the invention. A sync window 62 is selected from buffer 48 by sync window module 52. The start of sync window 62 and a selection window 64 selected by selection windowing module 56 are aligned. Buffer 48 is a circular buffer accepting digitized voice input. The position of sync window 62 is adjusted in accordance with quality measurements made by perceptual comparator 14, as indicated in FIG. 3. Alignment of selection window 64 with sync window 62 is accomplished, in this embodiment, by fine sync detector 24, including by selection of windowed data output from voice sentence generator 22.
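A circular buffer from which two windows with a common start are read can be sketched as follows. This Python example is a minimal illustration, assuming samples are addressed by an absolute index counted from the first sample written; the class `CircularBuffer` and its methods are hypothetical and do not describe the internal layout of buffer 48.

```python
from typing import Iterable, List


class CircularBuffer:
    """Fixed-capacity buffer that overwrites its oldest samples."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.data: List[float] = [0.0] * capacity
        self.total_written = 0  # absolute index of the next sample to write

    def write(self, samples: Iterable[float]) -> None:
        for sample in samples:
            self.data[self.total_written % self.capacity] = sample
            self.total_written += 1

    def window(self, start: int, length: int) -> List[float]:
        """Return samples [start, start + length) by absolute index."""
        oldest = max(0, self.total_written - self.capacity)
        if start < oldest or start + length > self.total_written:
            raise IndexError("window is not fully contained in the buffer")
        return [self.data[(start + i) % self.capacity] for i in range(length)]
```

With this model, a sync window and a selection window that share a start are simply `buf.window(start, sync_len)` and `buf.window(start, selection_len)`; moving `start` repositions both together.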

In the embodiment represented by FIG. 3, selection windowing module 56 also applies a window function to at least one of the received voice data and the pre-stored representation of voice test signals for data weighting. In one embodiment, a plurality of weighting functions is provided, including rectangular weighting, as represented in FIG. 5, nonlinear emphasized weighting, an example of which is represented in FIG. 6, and discontinuous rectangular weighting, an example of which is represented in FIG. 7. The weighting function is preselected through selection of a quality algorithm. The selection is also adaptively alterable, in accordance with a quality measurement from perceptual comparator 14 and as indicated in FIG. 3. Discontinuous rectangular weighting is used, for example, when disturbances such as hand-offs in a cellular system interfere with reception of voice signal data. In this case, in one embodiment, the algorithm used by perceptual comparator 14 excludes the disturbed periods from the quality evaluation. The occurrence and length of disturbed periods, in one embodiment, are reported separately from the quality measurement.
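The three weighting functions can be sketched as weight vectors applied to a window before comparison. In the Python example below, the rectangular and discontinuous rectangular shapes follow directly from their names, but the shape of the nonlinear emphasized window of FIG. 6 is not given in the text, so `emphasized` uses a sine raised to a power purely as an assumed placeholder. The function `weighted_error` is a stand-in waveform measure, not a perceptual algorithm.

```python
import math
from typing import List, Optional, Sequence, Tuple


def rectangular(n: int) -> List[float]:
    """Equal weight for every sample in the window."""
    return [1.0] * n


def emphasized(n: int, power: float = 2.0) -> List[float]:
    """Assumed placeholder shape: weight rises toward the window center."""
    return [math.sin(math.pi * (i + 0.5) / n) ** power for i in range(n)]


def discontinuous_rectangular(n: int,
                              disturbed: Sequence[Tuple[int, int]]) -> List[float]:
    """Rectangular window with zero weight over disturbed [start, end) spans."""
    weights = [1.0] * n
    for start, end in disturbed:
        for i in range(max(0, start), min(n, end)):
            weights[i] = 0.0
    return weights


def weighted_error(reference: Sequence[float], received: Sequence[float],
                   weights: Sequence[float]) -> Optional[float]:
    """Weighted mean squared error; returns None if all weights are zero."""
    total = sum(weights)
    if total == 0:
        return None
    return sum(w * (a - b) ** 2
               for w, a, b in zip(weights, reference, received)) / total
```

Zeroing the weights over a disturbed span removes that span from the score, which mirrors the exclusion of hand-off periods described above; the excluded spans themselves can be reported separately from the `disturbed` list.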

An embodiment of a test configuration in accordance with the invention is shown in FIG. 8. It will be recognized that many or all of the functional elements in VQA 10 can be implemented in software or firmware in a computer as a design choice; accordingly, VQA 10 is shown as a computer in FIG. 8. VQA 10 is connected to an output port of VEUT 12, which, in one embodiment, is a cellular telephone 12 with a “hands-free” port. In this manner, quality test messages 30 received by cellular telephone 12 are transmitted to VQA 10 for analysis. Cellular telephone 12 receives quality test messages 30 from a message source 66, for example, via a network 68 such as a cellular wireless network. In one embodiment, message source 66 is configured as an answering machine with recorded quality test messages 30 stored in voice mailboxes. The recorded quality test messages 30 in the voice mailboxes are identified with sentence IDs 34. Voice test signals 38 stored in message source 66 are identified with sentence IDs 34 that identify corresponding pre-stored representations of voice test messages in voice sentence generator 22 of VQA 10.

In one embodiment and referring to FIG. 9, VEUT 12 dials 100 message source 66 via network 68 and retrieves 102 a voice mail message therefrom. The retrieved voice mail message is a quality test message 30. VQA 10 then waits 104, 106 until unique word 32 is recognized. Next, sentence ID 34 is obtained 108. VQA 10 then waits until sync pulse 36 is received 110, 112. When sync pulse 36 is received, a local copy of voice test signal 38 is retrieved 114, for example, from voice sentence generator 22. Fine synchronization 116 of the local copy of voice test signal 38 is then performed, and a voice quality measure is computed 118 until it is determined 120 that voice test signal 38 has ended. When voice test signal 38 has ended, the computed quality is displayed 122, and the end of the test is reached 124. In other embodiments, quality tests may be repeated manually or automatically.

Those skilled in the art will recognize that the invention described herein provides real-time perceptual quality measurement of voice signals. The invention is particularly suitable for performing such measurements utilizing algorithms that have previously not been known to be suitable for real-time measurements of signals. The invention is also particularly suited for providing real-time perceptual quality measurements when a highly compressed voice signal is transmitted. Although embodiments described herein are applicable to quality measurements of voice signals, it will be recognized that the invention is also suitable for quality measurements of non-voice audio test signals. In these embodiments, voice quality analyzer 10 is thus, more generally, an audio quality analyzer (AQA), voice test signal 38 is an audio test signal, voice sentence generator 22 is an audio waveform generator (such as a digitized waveform generator), and the pre-stored representations of voice test signals in the audio waveform generator are pre-stored representations of audio test signals.

It will be evident to those skilled in the art that many other modifications are possible within the spirit of the invention. Therefore, the scope of the invention should be determined by reference to the claims appended below and their equivalents.

What is claimed is:
1. A method for providing real-time perceptual quality measurement of an audio signal, comprising: receiving a quality test signal, including receiving an audio test signal; coarsely synchronizing playback of a pre-stored representation of the audio test signal to the received audio test signal; finely synchronizing playback of the pre-stored representation of the audio test signal to the received audio test signal; and comparing a window of the received audio test signal to a portion of the finely synchronized playback of the pre-stored representation of the audio test signal to output a quality measurement of the received audio test signal.
2. A method in accordance with claim 1 wherein the quality test signal comprises a header signal including a synchronization pulse, and the step of coarsely synchronizing playback of the pre-stored representation of the audio test signal to the received audio test signal comprises synchronizing playback of the pre-stored representation of the audio test signal utilizing the synchronization pulse.
3. A method in accordance with claim 2 wherein finely synchronizing playback of the pre-stored representation of the audio test signal to the received audio test signal comprises the steps of: comparing data in a windowed portion of the received audio test signal and a windowed portion of the pre-stored representation of the audio test signal; and adjusting an alignment of the windowed portions of the received audio test signal and a windowed portion of the pre-stored representation of the audio test signal in accordance with results of the comparison.
4. A method in accordance with claim 3 further comprising the step of receiving the header signal out-of-band.
5. A method in accordance with claim 3 further comprising the step of receiving the header signal in-band.
6. A method in accordance with claim 5 wherein the step of receiving the header signal comprises the step of receiving dual tone multiple frequency (DTMF) tones, and the step of coarsely synchronizing playback of the pre-stored representation of the audio test signal comprises synchronizing playback of the pre-stored representation of the audio test signal to a DTMF pulse.
7. A method in accordance with claim 3 wherein the audio test signal is a voice test signal, and the pre-stored representation of the audio test signal is a pre-stored representation of the voice test signal.
8. A method in accordance with claim 7 further comprising the steps of: receiving a sentence ID identifying the received voice test signal; and selecting the pre-stored representation of the voice test signal from a plurality of pre-stored representations in accordance with the received sentence ID.
9. A method in accordance with claim 8 wherein the step of receiving a sentence ID identifying the received voice signal comprises the step of receiving dual tone multiple frequency (DTMF) tones identifying the received voice signal.
10. A method in accordance with claim 3 wherein comparing a window of the received audio test signal to a portion of the finely synchronized playback of the pre-stored representation of the audio test signal to output a quality measurement of the received audio test signal comprises the step of generating a quality measurement in accordance with at least one quality measurement algorithm selected from the group of quality measurements consisting of ITU P.861 perceptual speech quality measurement (PSQM), modified normalized block (MNB), modified Bark spectral distortion (MBSD) measure, and perceptual analysis measurement system (PAMS).
11. A method in accordance with claim 10 further comprising the steps of: receiving a sentence ID in the header signal; and selecting a quality measurement algorithm for generating the quality measurement in accordance with the received sentence ID.
12. A method in accordance with claim 3 further comprising the steps of: receiving a unique word transmitted in the header signal; and verifying that the unique word was received before outputting a quality measurement of the received audio test signal.
13. A method in accordance with claim 12 wherein receiving a unique word comprises the step of receiving a dual tone multiple frequency (DTMF) signal representing a unique word.
14. A method in accordance with claim 1 further comprising applying a windowing function to a window of at least one of the window of the received audio test signal and a window of the finely synchronized pre-stored representation of the audio test signal prior to comparing the windowed portions to generate the quality measurement.
15. A method in accordance with claim 14 wherein the step of applying a windowing function comprises preselecting a windowing function.
16. A method in accordance with claim 15 wherein the step of applying a windowing function comprises adaptively selecting a windowing function.
17. An audio quality analyzer (AQA) to evaluate quality of a quality test signal received by equipment under test, the quality test signal comprising an audio test signal, said AQA configured to: coarsely synchronize playback of a pre-stored representation of the audio test signal to the received audio test signal; finely synchronize playback of said pre-stored representation of the audio test signal to the received audio test signal; and compare a window of the received audio test signal to a portion of the finely synchronized playback of the pre-stored representation of the audio test signal to output a quality measurement of the received audio test signal.
18. An AQA in accordance with claim 17 wherein the quality test signal comprises a synchronization pulse, and said AQA is configured to coarsely synchronize playback of the pre-stored representation of the audio test signal to the received audio test signal utilizing the synchronization pulse.
19. An AQA in accordance with claim 18 wherein said AQA is configured to: compare data in a windowed portion of the received audio test signal and a windowed portion of the pre-stored representation of the audio test signal; and adjust an alignment of the windowed portions of the received audio test signal and a windowed portion of the pre-stored representation of the audio test signal in accordance with results of the comparison.
20. An AQA in accordance with claim 19 further configured to receive the header signal out-of-band.
21. An AQA in accordance with claim 19 further configured to receive the header signal in-band.
22. An AQA in accordance with claim 21 further configured to receive dual tone multiple frequency (DTMF) signals as the header signal, and to coarsely synchronize playback of the pre-stored representation of the audio test signal to a DTMF pulse.
23. An AQA in accordance with claim 19 wherein the audio test signal is a voice test signal, and the pre-stored representation of the audio test signal is a pre-stored representation of a voice test signal.
24. An AQA in accordance with claim 23 further configured to: receive a sentence ID identifying the received voice test signal; and select the pre-stored representation of the voice test signal from a plurality of pre-stored representations in accordance with the received sentence ID.
25. An AQA in accordance with claim 24 further configured to receive dual tone multiple frequency (DTMF) signals as the sentence ID.
26. An AQA in accordance with claim 19 configured to generate a quality measurement in accordance with at least one quality measurement algorithm selected from the group of quality measurement algorithms consisting of ITU P.861 perceptual speech quality measurement (PSQM), modified normalized block (MNB), modified Bark spectral distortion (MBSD) measure, and perceptual analysis measurement system (PAMS).
27. An AQA in accordance with claim 26 further configured to: receive a sentence ID in the header signal; and select a quality measurement algorithm for generating the quality measurement in accordance with the received sentence ID.
28. An AQA in accordance with claim 19 further configured to: receive a unique word transmitted in the header signal; and verify that the unique word was received before outputting a quality measurement of the received audio test signal.
29. An AQA in accordance with claim 28 further configured to receive a dual tone multiple frequency (DTMF) signal representing the unique word.
30. An AQA in accordance with claim 19 further configured to apply a windowing function to a window of at least one of the windowed portion of the received audio test signal and a windowed portion of the finely synchronized pre-stored representation of the audio test signal prior to comparing them to generate the quality measurement of the received audio test signal.
31. An AQA in accordance with claim 30 further configured to apply a preselected windowing function.
32. An AQA in accordance with claim 31 further configured to adaptively apply a windowing function.